Conversation

@zhengjun-xing

When running benchmarks with a large number of copies, the process may raise:
OSError: [Errno 24] Too many open files.

Example command:
(fbgemm_gpu_env)$ ulimit -n 1048576
(fbgemm_gpu_env)$ python ./bench/tbe/tbe_inference_benchmark.py nbit-cpu \
    --num-embeddings=40000000 --bag-size=2 --embedding-dim=96 \
    --batch-size=162 --num-tables=8 --weights-precision=int4 \
    --output-dtype=fp32 --copies=96 --iters=30000
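
To confirm that FD exhaustion is the cause, the number of descriptors the process holds can be compared against the limit. A Linux-only diagnostic sketch (not part of this patch):

    import os
    import resource

    # Number of file descriptors currently open in this process.
    open_fds = len(os.listdir("/proc/self/fd"))
    # Soft/hard limits, the same values that `ulimit -n` above adjusts.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"open fds: {open_fds} / soft limit: {soft} (hard: {hard})")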

PyTorch multiprocessing provides two shared-memory strategies:
1. file_descriptor (default)
2. file_system

The default file_descriptor strategy uses file descriptors as shared memory handles, which can result in a large number of open FDs when many tensors are shared.
If the total number of open FDs exceeds the system limit and cannot be raised, the file_system strategy should be used instead.

This patch allows switching to the file_system strategy by setting:
  export PYTORCH_SHARE_STRATEGY='file_system'

Reference:
https://pytorch.org/docs/stable/multiprocessing.html#sharing-strategies
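
For context, the switch itself is a one-line call into PyTorch. Below is a minimal sketch of how a benchmark could honor such a variable; the exact wiring in this patch may differ, and only the PYTORCH_SHARE_STRATEGY name is taken from the description above:

    import os
    import torch.multiprocessing as mp

    # Pick the sharing strategy from the environment before any tensors
    # are moved to shared memory; fall back to PyTorch's Linux default.
    strategy = os.environ.get("PYTORCH_SHARE_STRATEGY", "file_descriptor")
    if strategy not in mp.get_all_sharing_strategies():
        raise ValueError(f"Unknown sharing strategy: {strategy!r}")
    mp.set_sharing_strategy(strategy)

set_sharing_strategy must run before tensors are shared, which is why an environment variable (inherited by worker processes) is a convenient switch here.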

@netlify

netlify bot commented Oct 21, 2025

Deploy Preview for pytorch-fbgemm-docs ready!

🔨 Latest commit: 0135160
🔍 Latest deploy log: https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68f72f2c23f2cb0008303d1f
😎 Deploy Preview: https://deploy-preview-5037--pytorch-fbgemm-docs.netlify.app

meta-cla bot added the cla signed label Oct 21, 2025
@meta-codesync
Contributor

meta-codesync bot commented Nov 3, 2025

@q10 has imported this pull request. If you are a Meta employee, you can view this in D86135817.

q10 pushed a commit to q10/FBGEMM that referenced this pull request Nov 4, 2025
…ytorch#5037)

Summary:
X-link: facebookresearch/FBGEMM#2089

(Commit message repeats the pull request description above.)

Reviewed By: spcyppt

Differential Revision: D86135817

Pulled By: q10
q10 pushed a commit to q10/FBGEMM that referenced this pull request Nov 4, 2025
…ytorch#5083)

Summary:
Pull Request resolved: pytorch#5083

X-link: https://github.com/facebookresearch/FBGEMM/pull/2089

(Commit message repeats the pull request description above.)

Pull Request resolved: pytorch#5037

Reviewed By: spcyppt

Differential Revision: D86135817

Pulled By: q10
@meta-codesync meta-codesync bot closed this in 9df97a7 Nov 4, 2025
@meta-codesync
Contributor

meta-codesync bot commented Nov 4, 2025

@q10 merged this pull request in 9df97a7.

Bernard-Liu pushed a commit to ROCm/FBGEMM that referenced this pull request Nov 11, 2025
…ytorch#5083)

Summary:
Pull Request resolved: pytorch#5083

X-link: https://github.com/facebookresearch/FBGEMM/pull/2089

(Commit message repeats the pull request description above.)

Pull Request resolved: pytorch#5037

Reviewed By: spcyppt

Differential Revision: D86135817

Pulled By: q10

fbshipit-source-id: 15f6fe7e1de5e9fef828f5a1496dc1cf9b41c293